cluster constraint
Inexact Column Generation for Bayesian Network Structure Learning via Difference-of-Submodular Optimization
In this paper, we consider a score-based Integer Programming (IP) approach for solving the Bayesian Network Structure Learning (BNSL) problem. State-of-the-art BNSL IP formulations suffer from the exponentially large number of variables and constraints. A standard approach in IP to address such challenges is to employ row and column generation techniques, which dynamically generate rows and columns, while the complex pricing problem remains a computational bottleneck for BNSL. For the general class of $\ell_0$-penalized likelihood scores, we show how the pricing problem can be reformulated as a difference of submodular optimization problem, and how the Difference of Convex Algorithm (DCA) can be applied as an inexact method to efficiently solve the pricing problems. Empirically, we show that, for continuous Gaussian data, our row and column generation approach yields solutions with higher quality than state-of-the-art score-based approaches, especially when the graph density increases, and achieves comparable performance against benchmark constraint-based and hybrid approaches, even when the graph size increases.
- Asia > China > Hong Kong (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- North America > United States > Virginia > Arlington County > Arlington (0.04)
- (4 more...)
Large Language Model (LLM) for Standard Cell Layout Design Optimization
Standard cells are essential components of modern digital circuit designs. With process technologies advancing toward 2nm, more routability issues have arisen due to the decreasing number of routing tracks, increasing number and complexity of design rules, and strict patterning rules. The state-of-the-art standard cell design automation framework is able to automatically design standard cell layouts in advanced nodes, but it is still struggling to generate highly competitive Performance-Power-Area (PPA) and routable cell layouts for complex sequential cell designs. Consequently, a novel and efficient methodology incorporating the expertise of experienced human designers to incrementally optimize the PPA of cell layouts is highly necessary and essential. High-quality device clustering, with consideration of netlist topology, diffusion sharing/break and routability in the layouts, can reduce complexity and assist in finding highly competitive PPA, and routable layouts faster. In this paper, we leverage the natural language and reasoning ability of Large Language Model (LLM) to generate high-quality cluster constraints incrementally to optimize the cell layout PPA and debug the routability with ReAct prompting. On a benchmark of sequential standard cells in 2nm, we demonstrate that the proposed method not only achieves up to 19.4% smaller cell area, but also generates 23.5% more LVS/DRC clean cell layouts than previous work. In summary, the proposed method not only successfully reduces cell area by 4.65% on average, but also is able to fix routability in the cell layout designs.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
Interactive Visual Data Exploration with Subjective Feedback: An Information-Theoretic Approach
Puolamäki, Kai, Oikarinen, Emilia, Kang, Bo, Lijffijt, Jefrey, De Bie, Tijl
Visual exploration of high-dimensional real-valued datasets is a fundamental task in exploratory data analysis (EDA). Existing methods use predefined criteria to choose the representation of data. There is a lack of methods that (i) elicit from the user what she has learned from the data and (ii) show patterns that she does not know yet. We construct a theoretical model where identified patterns can be input as knowledge to the system. The knowledge syntax here is intuitive, such as "this set of points forms a cluster", and requires no knowledge of maths. This background knowledge is used to find a Maximum Entropy distribution of the data, after which the system provides the user data projections in which the data and the Maximum Entropy distribution differ the most, hence showing the user aspects of the data that are maximally informative given the user's current knowledge. We provide an open source EDA system with tailored interactive visualizations to demonstrate these concepts. We study the performance of the system and present use cases on both synthetic and real data. We find that the model and the prototype system allow the user to learn information efficiently from various data sources and the system works sufficiently fast in practice. We conclude that the information theoretic approach to exploratory data analysis where patterns observed by a user are formalized as constraints provides a principled, intuitive, and efficient basis for constructing an EDA system.
- North America > United States > Massachusetts > Middlesex County > Reading (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- (2 more...)
Bayesian Network Structure Learning with Integer Programming: Polytopes, Facets and Complexity
Cussens, James, Järvisalo, Matti, Korhonen, Janne H., Bartlett, Mark
The challenging task of learning structures of probabilistic graphical models is an important problem within modern AI research. Recent years have witnessed several major algorithmic advances in structure learning for Bayesian networks - arguably the most central class of graphical models - especially in what is known as the score-based setting. A successful generic approach to optimal Bayesian network structure learning (BNSL), based on integer programming (IP), is implemented in the GOBNILP system. Despite the recent algorithmic advances, current understanding of foundational aspects underlying the IP based approach to BNSL is still somewhat lacking. Understanding fundamental aspects of cutting planes and the related separation problem is important not only from a purely theoretical perspective, but also since it holds out the promise of further improving the efficiency of state-of-the-art approaches to solving BNSL exactly. In this paper, we make several theoretical contributions towards these goals: (i) we study the computational complexity of the separation problem, proving that the problem is NP-hard; (ii) we formalise and analyse the relationship between three key polytopes underlying the IP-based approach to BNSL; (iii) we study the facets of the three polytopes both from the theoretical and practical perspective, providing, via exhaustive computation, a complete enumeration of facets for low-dimensional family-variable polytopes; and, furthermore, (iv) we establish a tight connection of the BNSL problem to the acyclic subgraph problem.
- Europe > Finland > Uusimaa > Helsinki (0.04)
- Europe > United Kingdom > England > North Yorkshire > York (0.04)
- Europe > Iceland > Capital Region > Reykjavik (0.04)
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Bayesian Network Structure Learning with Integer Programming: Polytopes, Facets, and Complexity
Cussens, James, Järvisalo, Matti, Korhonen, Janne H., Bartlett, Mark
The challenging task of learning structures of probabilistic graphical models is an important problem within modern AI research. Recent years have witnessed several major algorithmic advances in structure learning for Bayesian networks---arguably the most central class of graphical models---especially in what is known as the score-based setting. A successful generic approach to optimal Bayesian network structure learning (BNSL), based on integer programming (IP), is implemented in the GOBNILP system. Despite the recent algorithmic advances, current understanding of foundational aspects underlying the IP based approach to BNSL is still somewhat lacking. Understanding fundamental aspects of cutting planes and the related separation problem( is important not only from a purely theoretical perspective, but also since it holds out the promise of further improving the efficiency of state-of-the-art approaches to solving BNSL exactly. In this paper, we make several theoretical contributions towards these goals: (i) we study the computational complexity of the separation problem, proving that the problem is NP-hard; (ii) we formalise and analyse the relationship between three key polytopes underlying the IP-based approach to BNSL; (iii) we study the facets of the three polytopes both from the theoretical and practical perspective, providing, via exhaustive computation, a complete enumeration of facets for low-dimensional family-variable polytopes; and, furthermore, (iv) we establish a tight connection of the BNSL problem to the acyclic subgraph problem.
- Europe > Finland > Uusimaa > Helsinki (0.04)
- Europe > United Kingdom > England > North Yorkshire > York (0.04)
- Europe > Iceland > Capital Region > Reykjavik (0.04)
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Exact Maximum Margin Structure Learning of Bayesian Networks
Peharz, Robert, Pernkopf, Franz
Recently, there has been much interest in finding globally optimal Bayesian network structures. These techniques were developed for generative scores and can not be directly extended to discriminative scores, as desired for classification. In this paper, we propose an exact method for finding network structures maximizing the probabilistic soft margin, a successfully applied discriminative score. Our method is based on branch-and-bound techniques within a linear programming framework and maintains an any-time solution, together with worst-case sub-optimality bounds. We apply a set of order constraints for enforcing the network structure to be acyclic, which allows a compact problem representation and the use of general-purpose optimization techniques. In classification experiments, our methods clearly outperform generatively trained network structures and compete with support vector machines.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- South America > Paraguay > Asunción > Asunción (0.04)
- North America > United States > New York (0.04)
- (3 more...)